function word
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.07)
- North America > Canada (0.05)
Pay Less Attention to Function Words for Free Robustness of Vision-Language Models
Tian, Qiwei, Lin, Chenhao, Zhao, Zhengyu, Shen, Chao
To address the trade-off between robustness and performance in robust VLMs, we observe that function words can make VLMs vulnerable to cross-modal adversarial attacks, and accordingly propose Function-word De-Attention (FDA) to mitigate the impact of function words. Similar to differential amplifiers, our FDA calculates the original and the function-word cross-attention within attention heads, and differentially subtracts the latter from the former for more aligned and robust VLMs. Comprehensive experiments include 2 SOTA baselines under 6 different attacks on 2 downstream tasks, 3 datasets, and 3 models. Overall, our FDA yields an average 18/13/53% ASR drop with only 0.2/0.3/0.6% performance drops on the 3 tested models on retrieval, and a 90% ASR drop with a 0.3% performance gain on visual grounding. We demonstrate the scalability, generalization, and zero-shot performance of FDA experimentally, and provide in-depth ablation studies and analysis. Code will be made publicly available.
- North America > United States (1.00)
- Asia > China > Shaanxi Province > Xi'an (0.40)
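The differential subtraction described in the FDA abstract can be illustrated with a toy, single-query example. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the way function-word attention is obtained (by masking the full attention weights), the strength `lam`, and the renormalisation step are all hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def de_attend(scores, is_function_word, lam=0.5):
    """Toy de-attention for one image-side query over text tokens.

    scores: raw attention logits over the text tokens.
    is_function_word: booleans marking function-word tokens (assumed given).
    lam: hypothetical subtraction strength in [0, 1), not from the paper.
    """
    full = softmax(scores)
    # Assumed function-word attention: the same weights, kept only on function words.
    func = [w if f else 0.0 for w, f in zip(full, is_function_word)]
    # Subtract the function-word part, then renormalise so the weights sum to 1.
    diff = [w - lam * fw for w, fw in zip(full, func)]
    total = sum(diff)
    return [d / total for d in diff]

tokens = ["a", "dog", "on", "the", "grass"]
mask = [True, False, True, True, False]
weights = de_attend([1.0, 2.0, 1.0, 1.0, 2.0], mask)
```

In this toy setting, attention mass moves from "a", "on", and "the" toward "dog" and "grass".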
A Stylometric Application of Large Language Models
Stropkay, Harrison F., Chen, Jiayi, Latifi, Mohammad J., Rockmore, Daniel N., Manning, Jeremy R.
We show that large language models (LLMs) can be used to distinguish the writings of different authors. Specifically, an individual GPT-2 model, trained from scratch on the works of one author, will predict held-out text from that author more accurately than held-out text from other authors. We suggest that, in this way, a model trained on one author's works embodies the unique writing style of that author. We first demonstrate our approach on books written by eight different (known) authors. We also use this approach to confirm R. P. Thompson's authorship of the well-studied 15th book of the Oz series, originally attributed to F. L. Baum.
- North America > United States > New Hampshire > Grafton County > Hanover (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.70)
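The attribution idea in the stylometry abstract (one model per author, with held-out text assigned to the author whose model predicts it best) can be sketched without GPT-2. The example below swaps in a smoothed unigram word model purely for illustration; the function names and toy corpora are hypothetical, and a unigram model captures far less style than the models used in the paper.

```python
import math
from collections import Counter

def train_unigram(text):
    """Stand-in for a per-author language model: word counts and their total."""
    counts = Counter(text.lower().split())
    return counts, sum(counts.values())

def cross_entropy(model, text, vocab_size):
    """Average negative log-likelihood per word, with add-one smoothing."""
    counts, total = model
    words = text.lower().split()
    nll = -sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)
    return nll / len(words)

def attribute(models, text, vocab_size):
    """Pick the author whose model gives the held-out text the lowest loss."""
    return min(models, key=lambda a: cross_entropy(models[a], text, vocab_size))

corpus = {
    "A": "the wizard walked the yellow road and the wizard sang",
    "B": "a ship sailed a grey sea while a sailor slept",
}
models = {author: train_unigram(text) for author, text in corpus.items()}
vocab_size = len({w for t in corpus.values() for w in t.split()}) + 1
guess = attribute(models, "the wizard sang", vocab_size)
```

The same comparison of per-author losses carries over when each stand-in model is replaced by a neural language model trained on one author's works.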
A Linguistics-Aware LLM Watermarking via Syntactic Predictability
Park, Shinwoo, Park, Hyejin, Ahn, Hyeseon, Han, Yo-Sub
As large language models (LLMs) continue to advance rapidly, reliable governance tools have become critical. Publicly verifiable watermarking is particularly essential for fostering a trustworthy AI ecosystem. A central challenge persists: balancing text quality against detection robustness. Recent studies have sought to navigate this trade-off by leveraging signals from model output distributions (e.g., token-level entropy); however, their reliance on these model-specific signals presents a significant barrier to public verification, as the detection process requires access to the logits of the underlying model. We introduce STELA, a novel framework that aligns watermark strength with the linguistic degrees of freedom inherent in language. STELA dynamically modulates the signal using part-of-speech (POS) n-gram-modeled linguistic indeterminacy, weakening it in grammatically constrained contexts to preserve quality and strengthening it in contexts with greater linguistic flexibility to enhance detectability. Our detector operates without access to any model logits, thus facilitating publicly verifiable detection. Through extensive experiments on typologically diverse languages (analytic English, isolating Chinese, and agglutinative Korean), we show that STELA surpasses prior methods in detection robustness. Our code is available at https://github.com/Shinwoo-Park/stela_watermark.
- North America > United States > New York > Rensselaer County > Troy (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
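The STELA abstract describes scaling watermark strength by how constrained the grammatical context is, using POS n-grams. A minimal sketch of that idea follows, assuming a bigram model over POS tags and a linear mapping from entropy to strength. The function names, the `max_delta` parameter, and the linear scaling are hypothetical and are not taken from the STELA code.

```python
import math
from collections import Counter, defaultdict

def pos_bigram_entropy(tag_sequences):
    """Entropy (in bits) of the next POS tag given the previous POS tag."""
    follow = defaultdict(Counter)
    for tags in tag_sequences:
        for prev, nxt in zip(tags, tags[1:]):
            follow[prev][nxt] += 1
    entropy = {}
    for prev, counts in follow.items():
        total = sum(counts.values())
        entropy[prev] = -sum((c / total) * math.log2(c / total)
                             for c in counts.values())
    return entropy

def watermark_strength(prev_tag, entropy, max_delta=2.0):
    """Scale a hypothetical watermark bias by normalised POS entropy."""
    peak = max(entropy.values()) or 1.0  # avoid dividing by zero
    return max_delta * entropy.get(prev_tag, 0.0) / peak

tagged = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "ADV"],
    ["DET", "NOUN", "VERB", "NOUN"],
]
entropy = pos_bigram_entropy(tagged)
```

Because the strength here depends only on POS tags of the text, it could be recomputed at detection time without model logits, which is the property the abstract emphasises.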
What Do Humans Hear When Interacting? Experiments on Selective Listening for Evaluating ASR of Spoken Dialogue Systems
Mori, Kiyotada, Kawano, Seiya, Liu, Chaoran, Ishi, Carlos Toshinori, Contreras, Angel Fernando Garcia, Yoshino, Koichiro
Spoken dialogue systems (SDSs) utilize automatic speech recognition (ASR) at the front end of their pipeline. The role of ASR in SDSs is to appropriately recognize the information in user speech that is relevant to response generation. Examining human selective listening, which refers to the ability to focus on and listen to the important parts of a conversation as it is spoken, will enable us to identify the ASR capabilities required for SDSs and evaluate them. In this study, we experimentally confirmed selective listening when humans generate dialogue responses by comparing human transcriptions made for generating dialogue responses with reference transcriptions. Based on our experimental results, we discuss the possibility of a new ASR evaluation method that leverages human selective listening and can identify the gap in transcription ability between ASR systems and humans.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention: Supplementary Material
To gain further insight into the comparison between our model and the current state of the art in sentence compression, we show results of our method and ablations in relation to ablations of the method by Zhao et al. (see Table 1). In their work, the authors added a "syntax-based Also shown is the number of model parameters. We show that our model, without additional syntactic information as was used in previous methods, still obtains SOTA performance. Figure 1: Additional paraphrase generation attention maps from our ablation study, for both sub-networks (TSM predictions and upstream task attention) in our joint architecture. TSM fixation predictions (left in blue) over epochs (last epoch is our converged models). However, we assume they do not play a role in performance between these two conditions, as these performance differences are not statistically significant.
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.07)
- North America > Canada (0.05)
Prior-based Noisy Text Data Filtering: Fast and Strong Alternative For Perplexity
Seo, Yeongbin, Kim, Gayoung, Kim, Jaehyung, Yeo, Jinyoung
As large language models (LLMs) are pretrained on massive web corpora, careful selection of data becomes essential to ensure effective and efficient learning. While perplexity (PPL)-based filtering has shown strong performance, it suffers from drawbacks: substantial time costs and inherent unreliability of the model when handling noisy or out-of-distribution samples. In this work, we propose a simple yet powerful alternative: a prior-based data filtering method that estimates token priors using corpus-level term frequency statistics, inspired by linguistic insights on word roles and lexical density. Our approach filters documents based on the mean and standard deviation of token priors, serving as a fast proxy to PPL while requiring no model inference. Despite its simplicity, the prior-based filter achieves the highest average performance across 20 downstream benchmarks, while reducing time cost by over 1000x compared to PPL-based filtering. We further demonstrate its applicability to symbolic languages such as code and math, and its dynamic adaptability to multilingual corpora without supervision.
- North America > United States (0.14)
- Asia > Middle East > Jordan (0.04)
- Education (0.47)
- Health & Medicine (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
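The prior-based filter in the abstract above estimates token priors from corpus term frequencies and filters documents by the mean and standard deviation of those priors. The sketch below follows that description under stated assumptions: whitespace tokenisation, the z-score rule, the `z_max` threshold, and all function names are hypothetical and do not come from the paper.

```python
from collections import Counter
from statistics import mean, pstdev

def token_priors(corpus):
    """Estimate each token's prior as its relative frequency in the corpus."""
    counts = Counter(tok for doc in corpus for tok in doc.split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def doc_stats(doc, priors):
    """Mean and standard deviation of the token priors in one document."""
    values = [priors[tok] for tok in doc.split()]
    return mean(values), pstdev(values)

def is_outlier(values, z_max):
    """Flag values more than z_max standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid dividing by zero
    return [abs(v - mu) / sigma > z_max for v in values]

def filter_docs(corpus, z_max=1.5):
    """Keep documents whose prior mean and std are both unremarkable."""
    priors = token_priors(corpus)
    means, stds = zip(*(doc_stats(d, priors) for d in corpus))
    bad_mean = is_outlier(means, z_max)
    bad_std = is_outlier(stds, z_max)
    return [d for d, bm, bs in zip(corpus, bad_mean, bad_std) if not (bm or bs)]

docs = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat and the dog",
    "zxq wvb kjh",
]
kept = filter_docs(docs)
```

On this toy corpus the gibberish document is dropped because it contains no frequent function words, so its mean prior is far below that of the other documents. No model inference is involved, only counting.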
A Joint Multitask Model for Morpho-Syntactic Parsing
Inostroza, Demian, Mistica, Mel, Vylomova, Ekaterina, Guest, Chris, Kurniawan, Kemal
We present a joint multitask model for the UniDive 2025 Morpho-Syntactic Parsing shared task, where systems predict both morphological and syntactic analyses following a novel UD annotation scheme. Our system uses a shared XLM-RoBERTa encoder with three specialized decoders for content word identification, dependency parsing, and morphosyntactic feature prediction. Our model achieves the best overall performance on the shared task's leaderboard covering nine typologically diverse languages, with an average MSLAS score of 78.7 percent, LAS of 80.1 percent, and Feats F1 of 90.3 percent. Our ablation studies show that matching the task's gold tokenization and content word identification are crucial to model performance. Error analysis reveals that our model struggles with core grammatical cases (particularly Nom-Acc) and nominal features across languages.
Sparse Autoencoders Can Capture Language-Specific Concepts Across Diverse Languages
Andrylie, Lyzander Marciano, Rahmanisa, Inaya, Ihsani, Mahardika Krisna, Wicaksono, Alfan Farizki, Wibowo, Haryo Akbarianto, Aji, Alham Fikri
Understanding the multilingual mechanisms of large language models (LLMs) provides insight into how they process different languages, yet this remains challenging. Existing studies often focus on individual neurons, but their polysemantic nature makes it difficult to isolate language-specific units from cross-lingual representations. To address this, we explore sparse autoencoders (SAEs) for their ability to learn monosemantic features that represent concrete and abstract concepts across languages in LLMs. While some of these features are language-independent, the presence of language-specific features remains underexplored. In this work, we introduce SAE-LAPE, a method based on feature activation probability, to identify language-specific features within the feed-forward network. We find that many such features predominantly appear in the middle to final layers of the model and are interpretable. These features influence the model's multilingual performance and language output and can be used for language identification with performance comparable to fastText along with more interpretability. Our code is available at https://github.com/LyzanderAndrylie/language-specific-features
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- South America > Brazil (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
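The SAE-LAPE abstract says language-specific features are identified from feature activation probabilities. One plausible reading, sketched below, is to compute how often a feature fires per language and then measure the entropy of that profile across languages, with low entropy suggesting a language-specific feature. The entropy step, the "activation > 0" firing rule, and all names are assumptions for illustration, not the authors' code.

```python
import math

def activation_probability(activations):
    """Fraction of tokens on which a feature fires (activation > 0)."""
    return sum(a > 0 for a in activations) / len(activations)

def language_entropy(probs_by_lang):
    """Entropy of a feature's activation profile across languages.

    Returns math.inf for a feature that never fires in any language.
    """
    total = sum(probs_by_lang.values())
    if total == 0:
        return math.inf
    dist = [p / total for p in probs_by_lang.values() if p > 0]
    return -sum(p * math.log(p) for p in dist)

# Hypothetical SAE feature activations on a few tokens per language.
feature_acts = {
    "en": [0.0, 1.2, 0.7, 0.0],
    "id": [0.0, 0.0, 0.0, 0.0],
    "ko": [0.0, 0.0, 0.0, 0.0],
}
profile = {lang: activation_probability(a) for lang, a in feature_acts.items()}
score = language_entropy(profile)
```

A feature that fires equally often in every language would instead reach the maximum entropy, log of the number of languages.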
Dynamik: Syntactically-Driven Dynamic Font Sizing for Emphasis of Key Information
Nishida, Naoto, Ishiguro, Yoshio, Rekimoto, Jun, Yamashita, Naomi
In today's globalized world, there are increasing opportunities for individuals to communicate using a common non-native language (lingua franca). Non-native speakers often have opportunities to listen to foreign languages, but may not comprehend them as fully as native speakers do. To aid real-time comprehension, live transcription of subtitles is frequently used in everyday life (e.g., during Zoom conversations, watching YouTube videos, or on social networking sites). However, simultaneously reading subtitles while listening can increase cognitive load. In this study, we propose Dynamik, a system that reduces cognitive load during reading by decreasing the size of less important words and enlarging important ones, thereby enhancing sentence contrast. Our results indicate that Dynamik can reduce certain aspects of cognitive load, specifically, participants' perceived performance and effort among individuals with low proficiency in English, as well as enhance the users' sense of comprehension, especially among people with low English ability. We further discuss our methods' applicability to other languages and potential improvements and further research directions.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > Italy > Sardinia > Cagliari (0.06)
- North America > United States > New York > New York County > New York City (0.06)
- Media (1.00)
- Education (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)